Abstract

Statistical inference based on ranks is reviewed. The role of the parameter $\gamma = \int f^2(x)\,dx$ and methods for its estimation are discussed. In particular, the use of density estimation methods is shown to provide a consistent estimate $\hat\gamma$ without the assumption of symmetry of the underlying distribution. The use of $\hat\gamma$ in constructing hypothesis tests in the linear model without assuming symmetry is discussed.




1.  Introduction 


Suppose we have a random sample $X_1, \ldots, X_n$ from a distribution with cumulative distribution function (cdf) $F(x-\theta)$. Further, suppose $F(\cdot)$ possesses a symmetric density $f(\cdot)$. Let $\phi^+(u)$, $0<u<1$, be a nondecreasing, square-integrable function standardized such that $\int_0^1 [\phi^+(u)]^2\,du = 1$. Let $R_1^+, \ldots, R_n^+$ be the ranks of the absolute values $|X_1|, \ldots, |X_n|$; then

$$S^+ = \sum_{i=1}^n \phi^+\!\left(\frac{R_i^+}{n+1}\right)\operatorname{sign}(X_i) \tag{1.1}$$


is a rank test statistic for testing $H_0\colon \theta = 0$. The distributional properties of $S^+$ were studied in great detail by Hajek and Sidak (1967). The test rejects $H_0\colon \theta = 0$ in favor of $H_A\colon \theta \ne 0$ if $|S^+| > c$, where $c$ may be determined from the permutation distribution of $S^+$ or a normal approximation. Under $H_0\colon \theta = 0$, $n^{-1/2}S^+$ has an asymptotic standard normal distribution. The Hodges-Lehmann (1963) estimate $\hat\theta$ of $\theta$ is an approximate solution of $S^+(\theta) = 0$, where $S^+(\theta) = \sum_{i=1}^n \phi^+(R_i^+(\theta)/(n+1))\operatorname{sign}(X_i-\theta)$ and $R_i^+(\theta)$ is the rank of $|X_i-\theta|$, $i = 1, \ldots, n$. If $P_0(S^+ < -k) = \alpha/2$ then $[\hat\theta_L, \hat\theta_U]$ is a $(1-\alpha)100\%$ confidence interval for $\theta$, where $S^+(\hat\theta_L) = k$ and $S^+(\hat\theta_U) = -k$ define the end points.
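As a concrete illustration of the signed-rank statistic 1.1, the following minimal sketch evaluates $S^+(\theta)$ with Wilcoxon scores $\phi^+(u) = 3^{1/2}u$. The function name `signed_rank_statistic` and the small data set are hypothetical, and the sketch assumes no ties among the $|X_i - \theta|$ and no exact zeros.

```python
import math

def signed_rank_statistic(xs, theta=0.0):
    """S+(theta) = sum_i phi+(R_i+(theta)/(n+1)) * sign(x_i - theta),
    with Wilcoxon scores phi+(u) = sqrt(3) * u (assumes no ties, no zeros)."""
    centered = [x - theta for x in xs]
    n = len(centered)
    # Rank the absolute values: rank 1 goes to the smallest |x_i - theta|.
    order = sorted(range(n), key=lambda i: abs(centered[i]))
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return sum(math.sqrt(3.0) * ranks[i] / (n + 1) * math.copysign(1.0, centered[i])
               for i in range(n))

# Hypothetical data, used only to exercise the function.
sample = [1.2, -0.4, 0.9, 2.1, -1.5, 0.3]
s_plus = signed_rank_statistic(sample)          # statistic at theta = 0
z_value = s_plus / math.sqrt(len(sample))       # n^{-1/2} S+, compared with a standard normal
```

Because $S^+(\theta)$ is a nonincreasing step function of $\theta$, the Hodges-Lehmann estimate and the interval end points can be found by scanning or bisecting over $\theta$.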

Define

$$\phi^+(u,f) = -\frac{f'\{F^{-1}[(u+1)/2]\}}{f\{F^{-1}[(u+1)/2]\}} \tag{1.2}$$

and

$$\tau^{-1} = \int_0^1 \phi^+(u)\,\phi^+(u,f)\,du. \tag{1.3}$$

Then the Pitman efficacy of the test based on $S^+$ is $\tau^{-1}$ (Hajek and Sidak 1967, p 220). Hence, the efficiency properties of $S^+$ are determined by $\tau$. Further, the estimate $\hat\theta$ has a normal limiting distribution with mean $\theta$ and variance $\tau^2/n$ (Hodges and Lehmann 1963). Finally,

$$\hat\tau = n(\hat\theta_U - \hat\theta_L)/(2k) \tag{1.4}$$

converges in probability to $\tau$ (Sen 1966). Hence, if the efficiencies of the point estimate and confidence interval are defined by their asymptotic variance and asymptotic length, then the remarks above show that estimation methods inherit their efficiency properties from $S^+$, the parent test statistic. We now turn to the linear model.

Let $Y_1, \ldots, Y_n$ be independent observations on a linear model. We suppose $Y_i$ has cdf $F(y-\alpha-x_i'\beta)$, $i = 1, \ldots, n$, where $\beta$ is a $p\times 1$ vector of regression parameters, $x_i'$ is the $i$th row of the $n\times p$ design matrix $X$, $\alpha$ is an intercept parameter and $F(\cdot)$ has density $f(\cdot)$. We define a score function $\phi(\cdot)$ corresponding to the one-sample score function $\phi^+(\cdot)$ by

$$\phi(u) = \begin{cases} -\phi^+(1-2u), & 0 < u < 1/2 \\ \phi^+(2u-1), & 1/2 \le u < 1. \end{cases} \tag{1.5}$$

Then $\int_0^1 \phi(u)\,du = 0$ and $\int_0^1 \phi^2(u)\,du = 1$. Let

$$D(Y-\alpha-X\beta) = \sum_{i=1}^n \phi\!\left(\frac{R_i(\beta)}{n+1}\right)(Y_i-\alpha-x_i'\beta), \tag{1.6}$$

where $R_1(\beta), \ldots, R_n(\beta)$ are the ranks of $Y_1-\alpha-x_1'\beta, \ldots, Y_n-\alpha-x_n'\beta$. Then Jaeckel (1972) defined a rank estimate of $\beta$ ($D$ is invariant to $\alpha$) by a value $\hat\beta$ which minimizes $D(Y-\alpha-X\beta)$. The negative gradient of $D$ with respect to $\beta$ exists for almost all $\beta$ and is equal to the $p\times 1$ vector $S(Y-\alpha-X\beta)$ with $j$th component $S_j(Y-\alpha-X\beta) = \sum_{i=1}^n x_{ij}\,\phi(R_i(\beta)/(n+1))$. An asymptotically equivalent rank estimate $\hat\beta$ may be defined by $S(Y-\alpha-X\beta) \doteq 0$. Jureckova (1971) and Jaeckel (1972) showed that $\hat\beta$ has a multivariate normal limiting distribution with mean $\beta$ and covariance matrix $\tau^2(X_c'X_c)^{-1}$, where $X_c$ is the centered design matrix, assumed to have full rank.
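The dispersion function and its gradient can be made concrete for a single regressor ($p = 1$). The sketch below assumes Wilcoxon scores $\phi(u) = 12^{1/2}(u - 1/2)$; the function names and the five data points are hypothetical, chosen only so that the invariance to $\alpha$ and the gradient relation can be checked numerically.

```python
import math

def wilcoxon_score(u):
    # phi(u) = sqrt(12) * (u - 1/2), obtained from phi+(u) = sqrt(3) * u
    return math.sqrt(12.0) * (u - 0.5)

def ranks(values):
    # Rank 1 for the smallest value (assumes no ties).
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for k, i in enumerate(order, start=1):
        r[i] = k
    return r

def dispersion(y, x, beta, alpha=0.0):
    """D(Y - alpha - X beta) for one regressor."""
    res = [yi - alpha - xi * beta for yi, xi in zip(y, x)]
    n = len(res)
    return sum(wilcoxon_score(r / (n + 1)) * e for r, e in zip(ranks(res), res))

def gradient_statistic(y, x, beta):
    """S(Y - X beta) = sum_i x_i * phi(R_i(beta)/(n+1)), the negative gradient of D."""
    res = [yi - xi * beta for yi, xi in zip(y, x)]
    n = len(res)
    return sum(xi * wilcoxon_score(r / (n + 1)) for xi, r in zip(x, ranks(res)))

# Hypothetical data.
y = [1.0, 2.5, 2.9, 4.2, 5.1]
x = [0.0, 1.0, 2.0, 3.0, 4.0]
d_value = dispersion(y, x, 0.5)
s_value = gradient_statistic(y, x, 0.5)
```

Since the scores sum to zero, changing `alpha` leaves `dispersion` unchanged, and between rank changes $D$ is linear in $\beta$ with slope $-S$.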

The parameter $\tau$ can be defined in terms of the score function $\phi(\cdot)$ as follows: let $\phi(u,f)$ be derived from $\phi^+(u,f)$ through 1.5; then

$$\int_0^1 \phi(u)\,\phi(u,f)\,du = \int_0^1 \phi^+(u)\,\phi^+(u,f)\,du = \tau^{-1}.$$

If we write $X\beta = X_1\beta_1 + X_2\beta_2$, where $\beta_1$ is $(p-q)\times 1$ and $\beta_2$ is $q\times 1$, then we can consider rank-based tests of $H_0\colon \beta_2 = 0$, $\beta_1$ unspecified. Let $\hat\beta_1$ be the rank estimate of $\beta_1$ when $\beta_2 = 0$ (reduced-model estimate), and let $\hat\beta$ denote the full-model estimate of $\beta$. Partition $S$ into $S_1$ and $S_2$; then the $q\times 1$ vector of aligned rank statistics $S_2(Y-\alpha-X_1\hat\beta_1)$, under $H_0$, has a multivariate normal limiting distribution with mean 0 and a covariance matrix which can be written using a similar partitioning of the centered design $X_c$:

$$A = X_{2c}'X_{2c} - X_{2c}'X_{1c}(X_{1c}'X_{1c})^{-1}X_{1c}'X_{2c}. \tag{1.7}$$

See Adichie (1978) and Sen and Puri (1977).

Under $H_0$ the following three random variables all have asymptotic chi-square distributions with $q$ degrees of freedom:

$$S_2'(Y-\alpha-X_1\hat\beta_1)\,A^{-1}\,S_2(Y-\alpha-X_1\hat\beta_1), \tag{1.8}$$

$$\frac{(H\hat\beta)'[H(X_c'X_c)^{-1}H']^{-1}(H\hat\beta)}{\tau^2}, \tag{1.9}$$

where $H = (0, I)$ such that $H\beta = \beta_2$, and

$$\frac{D(Y-\alpha-X_1\hat\beta_1) - D(Y-\alpha-X\hat\beta)}{\tau/2}. \tag{1.10}$$

See McKean and Hettmansperger (1976) for a discussion of 1.10. Tests based on 1.8, 1.9 and 1.10 have the same Pitman efficacy and so cannot be separated using asymptotic efficiency.

For a moment suppose $\tau$ is known. Then the asymptotic distribution theory for 1.8-1.10 does not require symmetry of $f(\cdot)$. However, if $\tau$ must be estimated or if we wish to make inferences on the intercept $\alpha$, then one-sample methods seem to be needed. First form the full-model residuals $Y_i - x_i'\hat\beta$, $i = 1, \ldots, n$. Now apply $S^+$ to estimate $\alpha$ and to compute $\hat\tau = n(\hat\alpha_U - \hat\alpha_L)/(2k)$. McKean and Hettmansperger (1976) showed that $\hat\tau$ is consistent, provided $f(\cdot)$ is symmetric.

Hence, if we suppose $f(\cdot)$ is symmetric, then we can estimate $\tau$ consistently for use in constructing the tests based on 1.9 and 1.10 and for estimating the asymptotic standard errors of $\hat\beta$. In the remainder of the paper we consider what happens to $\hat\tau$ when $f(\cdot)$ is not symmetric and discuss alternative estimates of $\tau$ which do not require symmetry.

2.  Estimation of $\int f^2(x)\,dx$

In this section we will consider a random sample $X_1, \ldots, X_n$ from a distribution with cdf $F(x)$. Further, suppose $F(\cdot)$ possesses a density $f(\cdot)$ which may not be symmetric. We will restrict attention to Wilcoxon scores defined by $\phi^+(u) = 3^{1/2}u$. Then 1.3 becomes $\tau^{-1} = 12^{1/2}\int f^2(x)\,dx$, and we will concentrate on the consistent estimation of

$$\gamma = \int f^2(x)\,dx. \tag{2.1}$$
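The reduction of 1.3 to $12^{1/2}\int f^2(x)\,dx$ under Wilcoxon scores can be checked directly. The following worked sketch assumes that 1.2 has the usual form $\phi^+(u,f) = -f'(x)/f(x)$ evaluated at $x = F^{-1}[(u+1)/2]$, that $f$ is symmetric about 0, and that $f(x) \to 0$ as $x \to \infty$; the substitution variable $x$ is introduced only for this derivation.

```latex
\begin{aligned}
\tau^{-1} &= \int_0^1 \sqrt{3}\,u\,\phi^+(u,f)\,du
  && \text{substitute } u = 2F(x)-1,\ du = 2f(x)\,dx \\
 &= 2\sqrt{3}\int_0^\infty \{2F(x)-1\}\{-f'(x)\}\,dx
  && \text{integrate by parts} \\
 &= 2\sqrt{3}\,\Big[-\{2F(x)-1\}f(x)\Big]_0^\infty + 4\sqrt{3}\int_0^\infty f^2(x)\,dx
  && \text{boundary term is } 0 \text{ since } F(0) = 1/2 \\
 &= 2\sqrt{3}\int_{-\infty}^{\infty} f^2(x)\,dx
  = 12^{1/2}\int f^2(x)\,dx .
\end{aligned}
```

The last line uses symmetry of $f$ to replace $2\int_0^\infty f^2$ by the integral over the whole line.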


When $f(\cdot)$ is symmetric, the consistency of $\hat\tau$ in 1.4 is usually derived from an asymptotic linearity result on $S^+$ due to van Eeden (1972). The result can be easily modified for $f(\cdot)$ with arbitrary shape:

$$n^{-1/2}S^+(n^{-1/2}b) - n^{-1/2}S^+(n^{-1/2}a) = -(b-a)(12)^{1/2}\int_{-\infty}^{\infty} f(-x)f(x)\,dx + o_p(1), \tag{2.2}$$

where $o_p(1)$ tends to zero in probability uniformly for all $a, b$ such that $|b-a| \le K$, for any positive constant $K$. By $S^+(\theta)$ we mean $S^+$ computed on $X_i - \theta$, $i = 1, \ldots, n$.

Since $n^{1/2}\hat\theta_L$ and $n^{1/2}\hat\theta_U$ are bounded in probability, where $S^+(\hat\theta_L) = k$ and $S^+(\hat\theta_U) = -k$, we have from 2.2

$$\hat\gamma = -\frac{S^+(\hat\theta_U) - S^+(\hat\theta_L)}{12^{1/2}\,n(\hat\theta_U - \hat\theta_L)} = \frac{2k}{12^{1/2}\,n(\hat\theta_U - \hat\theta_L)} = \int_{-\infty}^{\infty} f(-x)f(x)\,dx + o_p(1). \tag{2.3}$$

Note $k \doteq z_{\alpha/2}\,n^{1/2}$ from the normal approximation, where $z_{\alpha/2}$ is the upper $\alpha/2$ percentile of the standard normal distribution. Note that $k$ is chosen under the assumption of symmetric $f(\cdot)$ and in general will not be the $\alpha/2$ critical value of $S^+$. This does not affect the convergence in 2.3. From 2.3, since $\hat\gamma \xrightarrow{p} \int f(-x)f(x)\,dx$, it is evident that the usual estimate $\hat\tau$, 1.4, is no longer consistent. The amount of large sample bias depends on the size of $\int f^2(x)\,dx - \int f(-x)f(x)\,dx$.
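To give a sense of how large the gap between $\int f^2(x)\,dx$ and $\int f(-x)f(x)\,dx$ can be, the sketch below evaluates both integrals numerically for one skewed density. The density (an exponential shifted to have median 0), the helper `integrate`, and the integration limits are assumptions made for this illustration, not part of the original discussion.

```python
import math

def integrate(g, lo, hi, steps=100_000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(g(lo + (k + 0.5) * width) for k in range(steps))

LOG2 = math.log(2.0)

def f(x):
    # Assumed example: exponential(1) density shifted so that its median is 0.
    return math.exp(-(x + LOG2)) if x > -LOG2 else 0.0

# Target parameter gamma = int f^2; the upper limit 40 truncates a negligible tail.
target = integrate(lambda x: f(x) ** 2, -LOG2, 40.0)
# Limit of the confidence-interval estimate in 2.3; the integrand vanishes outside (-ln 2, ln 2).
limit = integrate(lambda x: f(-x) * f(x), -LOG2, LOG2)
large_sample_bias = limit - target
```

For this density the two integrals are $1/2$ and $(\ln 2)/2 \approx 0.347$, so the interval-based estimate would settle roughly 30% below the target.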

We next consider Jaeckel's (1971) model of asymmetric contamination. Suppose for a given sample size $n$ we are sampling from

$$G_n(x) = (1 - cn^{-1/2})F(x) + cn^{-1/2}H(x), \tag{2.4}$$

where $F(\cdot)$ is the cdf of a symmetric distribution and $H(\cdot)$ is the cdf of a distribution with arbitrary shape. Jaeckel points out: "The amount of asymmetric contamination is large enough to affect the performance of the estimator, but is too small to be measured accurately at the given sample size." We will present a heuristic derivation to suggest that $n^{1/2}(\hat\gamma - \int f^2(x)\,dx)$ has an asymptotic $n(b, \sigma^2)$ distribution, where $b$ is the asymptotic bias and $b^2 + \sigma^2$ is the asymptotic mean square error. Thus, both the bias and variability of the estimator approach zero at the same rate.

Using the heuristic argument of Huber (1969) to construct the projection of the estimator, we have

$$n^{1/2}\Big(\hat\gamma - \int f^2(x)\,dx\Big) = 2n^{-1/2}\sum_{i=1}^n \Big[f(X_i) - \int f^2(x)\,dx\Big] + o_p(1). \tag{2.5}$$

Under the model of symmetry, with no contamination, $n^{1/2}(\hat\gamma - \int f^2(x)\,dx)$ is asymptotically $n(0, 4\{\int f^3(x)\,dx - [\int f^2(x)\,dx]^2\})$. For a rigorous discussion see Antille (1972, 1974).

To determine the limiting distribution of 2.5 under the contamination model 2.4 we use the contiguity results of Hajek and Sidak (1967, Chapter 6). We will assume sufficient smoothness of the densities $f(\cdot)$ and $h(\cdot)$ to carry out the required expansions. Then the log-likelihood can be written as

$$\log L = cn^{-1/2}\sum_{i=1}^n \Big(\frac{h(X_i)}{f(X_i)} - 1\Big) + o_p(1). \tag{2.6}$$

Now $(n^{1/2}(\hat\gamma - \int f^2(x)\,dx), \log L)$ is asymptotically bivariate normal. We need the covariance:

$$\begin{aligned} b &= 2cE\Big\{\Big[f(X_i) - \int f^2(x)\,dx\Big]\Big[\frac{h(X_i)}{f(X_i)} - 1\Big]\Big\} \\ &= 2c\Big[\int f(x)h(x)\,dx - \int f^2(x)\,dx\Big]. \end{aligned} \tag{2.7}$$

The above results show that the densities $\prod g_n(x_i)$ and $\prod f(x_i)$ are contiguous, and the limiting distribution of $n^{1/2}(\hat\gamma - \int f^2(x)\,dx)$, under the contamination model 2.4, is $n(2c[\int f(x)h(x)\,dx - \int f^2(x)\,dx],\ 4[\int f^3(x)\,dx - (\int f^2(x)\,dx)^2])$. The expression 2.7 then represents the asymptotic bias, which can be far from zero, depending on the choice of $h(\cdot)$.

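We now return to 2.3 for a different representation of the estimator $\hat\gamma$. Recall the counting form for the Wilcoxon signed-rank statistic $S^+(\theta)$:

$$\begin{aligned} S^+(\theta) &= \frac{3^{1/2}}{n+1}\Big\{\sum\sum_{i\le j} I(X_i + X_j > 2\theta) - \sum\sum_{i\le j} I(X_i + X_j < 2\theta)\Big\} \\ &= \frac{2(3^{1/2})}{n+1}\Big\{\sum\sum_{i\le j} I\Big(\frac{X_i + X_j}{2} > \theta\Big) - \frac{n(n+1)}{4}\Big\}. \end{aligned} \tag{2.8}$$

Let $T(\theta) = [n(n+1)]^{-1}\sum\sum_{i\le j} I(X_i + X_j > 2\theta)$; then from 2.3

$$\hat\gamma = -\frac{T(\hat\theta_U) - T(\hat\theta_L)}{\hat\theta_U - \hat\theta_L} = \frac{T(\hat\theta_L) - T(\hat\theta_U)}{h_n/2}, \tag{2.9}$$

where $h_n = 2(\hat\theta_U - \hat\theta_L) = 4(\hat\theta_U - \hat\theta)$, $\hat\theta = (\hat\theta_L + \hat\theta_U)/2$. Note that $h_n \xrightarrow{p} 0$ and $\hat\theta \xrightarrow{p} \theta$, which we take to be zero without loss of generality. With a bit of algebra, we can write 2.9 as

$$\hat\gamma \doteq \frac{1}{n(n+1)h_n}\sum\sum_{i\ne j} I\Big\{|X_i + X_j - 2\hat\theta| < \frac{h_n}{2}\Big\} + \frac{2}{n(n+1)h_n}\sum_{i=1}^n I\Big\{|X_i - \hat\theta| < \frac{h_n}{4}\Big\}. \tag{2.10}$$

The approximate equality is due to using absolute values to represent the counts in 2.8. We next show that $\hat\gamma$ can be related to density estimators.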
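The counting form 2.8 can be checked against the rank definition of $S^+(\theta)$ on a small data set. This is a minimal sketch with hypothetical function names and data; it assumes no ties among the $|X_i - \theta|$ and no pairwise average exactly equal to $\theta$.

```python
import math

def s_plus_ranks(xs, theta):
    """S+(theta) from signed ranks with Wilcoxon scores sqrt(3) * R/(n+1)."""
    centered = [x - theta for x in xs]
    n = len(centered)
    order = sorted(range(n), key=lambda i: abs(centered[i]))
    total = 0.0
    for rank, i in enumerate(order, start=1):
        total += rank * (1.0 if centered[i] > 0 else -1.0)
    return math.sqrt(3.0) * total / (n + 1)

def s_plus_counts(xs, theta):
    """S+(theta) from the count of pairwise averages (X_i + X_j)/2, i <= j, above theta."""
    n = len(xs)
    above = sum(1 for i in range(n) for j in range(i, n)
                if xs[i] + xs[j] > 2.0 * theta)
    return 2.0 * math.sqrt(3.0) / (n + 1) * (above - n * (n + 1) / 4.0)

# Hypothetical data.
sample = [1.2, -0.4, 0.9, 2.1, -1.5, 0.3]
pairs = [(s_plus_ranks(sample, t), s_plus_counts(sample, t))
         for t in (-1.0, 0.0, 0.5, 1.0)]
```

The two forms agree at every $\theta$ that is not a pairwise average, which is what allows $\hat\gamma$ in 2.9 to be written as a difference of counts.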

Rosenblatt (1956) proposed the following simple, rectangular window estimator of $f(x)$:

$$f_n(x) = \frac{1}{nh_n}\sum_{i=1}^n I\Big\{|x - X_i| < \frac{h_n}{2}\Big\}. \tag{2.11}$$

For large $n$, $\hat\gamma$ is essentially the same as

$$\gamma^* = \frac{1}{n^2h_n}\sum_{i=1}^n\sum_{j=1}^n I\Big\{|X_i + X_j| < \frac{h_n}{2}\Big\} = \int_{-\infty}^{\infty} f_n(-x)\,dF_n(x), \tag{2.12}$$

where $F_n(x)$ is the empirical cdf. This representation suggests two points: (1) To estimate $\int f^2(x)\,dx$ consistently, we should consider pairwise differences rather than pairwise sums or averages, and (2) it may be advantageous to use density estimation directly. In the next section we discuss density estimation of $\int f^2(x)\,dx$. For estimates based on the differences the reader is referred to Sievers (1982).
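The contrast between pairwise sums and pairwise differences can be seen in a small simulation. The sketch below is illustrative only: the shifted exponential model, the seed, the sample size and the window width `h` are all assumptions, and `pair_fraction` is a hypothetical helper that evaluates $\int f_n(\mp x)\,dF_n(x)$ with the rectangular window 2.11.

```python
import math
import random

def pair_fraction(xs, h, sign):
    """(n^2 h)^{-1} * #{(i, j): |X_i + sign * X_j| < h/2}, over all i, j including i = j."""
    n = len(xs)
    hits = sum(1 for a in xs for b in xs if abs(a + sign * b) < h / 2.0)
    return hits / (n * n * h)

rng = random.Random(0)
n = 600
# Assumed model: exponential(1) shifted to median 0, for which int f^2 = 1/2
# and int f(-x) f(x) = (ln 2)/2, about 0.347.
sample = [rng.expovariate(1.0) - math.log(2.0) for _ in range(n)]
h = 0.2  # illustrative window width, not a recommendation

gamma_star = pair_fraction(sample, h, +1.0)   # sums, as in 2.12
gamma_diff = pair_fraction(sample, h, -1.0)   # differences: int f_n(x) dF_n(x)
```

Under these assumptions the sum-based value should sit near $(\ln 2)/2$ while the difference-based value should sit near $1/2$, up to sampling noise and the smoothing bias of a fixed window.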

3.  Density Estimates of $\int f^2(x)\,dx$


As suggested by 2.12, consistent estimation of $\gamma$ in 2.1, for $f(\cdot)$ with arbitrary shape, may be achieved by using density estimates. Suppose that $W(\cdot)$ is a given density, symmetric about 0. Let

$$f_n(x) = \frac{1}{nh_n}\sum_{i=1}^n W\Big(\frac{x - X_i}{h_n}\Big). \tag{3.1}$$

Then $f_n(x)$ is called a window estimate of $f(x)$, $W(\cdot)$ is called the window, and $h_n \to 0$ is called the window width. These estimates were first discussed by Rosenblatt (1956) and Parzen (1962). See Wegman (1972) and Bean and Tsokos (1980, 1982) for reviews of density estimation. The estimate 2.11 is an example, with $W(\cdot)$ taken as the uniform density on $(-1/2, 1/2)$.

Bhattacharyya and Roussas (1969) considered $\int f_n^2(x)\,dx$, and Cheng and Serfling (1981) considered the more general $\int \phi(x)\,\psi(\tilde F_n(x))\,f_n^2(x)\,dx$, where $\tilde F_n(x)$ is the integral of $f_n(x)$. On the other hand, Schuster (1974) and Ahmad (1976) studied $\int f_n(x)\,dF_n(x)$, where $F_n(x)$ is the empirical cdf. Schweder (1975) considered $\int \psi(F_n(x))\,f_n(x)\,dF_n(x)$. The more general form of each approach is appropriate for estimation of $\tau^{-1}$ in 1.3. Again, we restrict attention to the special case of $\gamma$, 2.1. The works cited above contain results on the weak and strong consistency and weak convergence of the various estimates. The following theorem relates the two estimators, $\int f_n^2(x)\,dx$ and $\int f_n(x)\,dF_n(x)$.

Theorem. Let $f_n(\cdot)$ be given by 3.1. Then $\int f_n^2(x)\,dx = \int f_n^*(x)\,dF_n(x)$, where $f_n^*(x)$ is defined by 3.1 using $W^*(x) = W{*}W(x)$, the convolution density.

Proof. We have

$$\int f_n^2(x)\,dx = (nh_n)^{-2}\sum_{i=1}^n\sum_{j=1}^n \int W\Big(\frac{X_i - x}{h_n}\Big)W\Big(\frac{X_j - x}{h_n}\Big)\,dx.$$

Let $z = (X_j - x)/h_n$; then

$$\begin{aligned} \int f_n^2(x)\,dx &= n^{-2}h_n^{-1}\sum_{i=1}^n\sum_{j=1}^n \int W\Big(z + \frac{X_i - X_j}{h_n}\Big)W(z)\,dz \\ &= n^{-2}h_n^{-1}\sum_{i=1}^n\sum_{j=1}^n W^*\Big(\frac{X_i - X_j}{h_n}\Big), \end{aligned}$$

where $W^*(z)$ is the convolution of $W(z)$ with $W(-z)$. But since $W(\cdot)$ is symmetric about 0, $W^*(z) = W{*}W(z)$. Now let $f_n^*(x)$ be given by 3.1, with window $W^*(\cdot)$; then $\int f_n^2(x)\,dx = \int f_n^*(x)\,dF_n(x)$, where $F_n(\cdot)$ is the empirical cdf.
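The identity in the theorem can be verified numerically for the rectangular window, whose self-convolution is the triangular density on $(-1, 1)$. In this sketch the function names, data and window width are hypothetical; the left-hand side is computed exactly by exploiting the fact that a rectangular-window estimate is piecewise constant.

```python
def window_estimate(x, data, h):
    """f_n(x) with the uniform window W(z) = 1 for |z| < 1/2."""
    return sum(1.0 for xi in data if abs(x - xi) < h / 2.0) / (len(data) * h)

def integral_of_square(data, h):
    """int f_n(x)^2 dx, summed exactly over the intervals where f_n is constant."""
    cuts = sorted([xi - h / 2.0 for xi in data] + [xi + h / 2.0 for xi in data])
    total = 0.0
    for a, b in zip(cuts, cuts[1:]):
        if b > a:
            total += window_estimate((a + b) / 2.0, data, h) ** 2 * (b - a)
    return total

def convolved_plug_in(data, h):
    """int f_n^*(x) dF_n(x) with W^* = W*W, the triangular density max(0, 1 - |z|)."""
    n = len(data)
    return sum(max(0.0, 1.0 - abs(xi - xj) / h)
               for xi in data for xj in data) / (n * n * h)

# Hypothetical data and window width.
data = [0.1, 0.35, 0.4, 0.9, 1.6, 1.75]
lhs = integral_of_square(data, 0.5)
rhs = convolved_plug_in(data, 0.5)
```

The two quantities agree to rounding error, which reflects that smoothing twice with $W$ is the same as smoothing once with $W{*}W$.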

Since not every density can be written as the convolution of some density with itself, we cannot, in general, reverse the argument. Further, the relationship seems only to hold for the case of Wilcoxon scores.

In the following discussion we present still another motivation for using $\int f_n(x)\,dF_n(x)$ as a natural estimate of $\int f^2(x)\,dx$. Note that

$$P(X_1 - X_2 \le q) = \int_{-\infty}^{\infty}\int_{-\infty}^{y+q} dF(x)\,dF(y). \tag{3.2}$$
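Define the functional $T(F)$ by

$$T(F) = \frac{d}{dq}\int_{-\infty}^{\infty}\int_{-\infty}^{y+q} dF(x)\,dF(y)\,\Big|_{q=0} = \int_{-\infty}^{\infty} f^2(x)\,dx. \tag{3.3}$$

This suggests the estimate $T(F_n)$, which can be approximated by

$$T(F_n) \doteq q_n^{-1}\Big\{\int_{-\infty}^{\infty}\int_{-\infty}^{y+q_n} dF_n(x)\,dF_n(y) - \int_{-\infty}^{\infty}\int_{-\infty}^{y} dF_n(x)\,dF_n(y)\Big\}, \tag{3.4}$$

where $q_n \to 0$ ($q_n > 0$). This reduces to

$$T(F_n) \doteq (2q_nn^2)^{-1}\sum\sum_{i\ne j} I\{|X_i - X_j| \le q_n\} \doteq \int f_n(x)\,dF_n(x), \tag{3.5}$$

with $h_n = 2q_n$. The approximation arises because the $i = j$ terms are not present on the left-hand side and we have $\le$ in the indicator.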

Let $F_\varepsilon(x) = (1-\varepsilon)F(x) + \varepsilon\delta_z(x)$, where $\delta_z(x)$ is the cdf which puts mass 1 at $z$. Then Hampel's (1974) influence curve for the functional $T(F)$ is defined as

$$IC(z) = \lim_{\varepsilon\to 0}\frac{T(F_\varepsilon) - T(F)}{\varepsilon}. \tag{3.6}$$

From 3.3 we have

$$\begin{aligned} T(F_\varepsilon) &= \frac{d}{dq}\int \big[(1-\varepsilon)F(y+q) + \varepsilon\delta_z(y+q)\big]\,d\big[(1-\varepsilon)F(y) + \varepsilon\delta_z(y)\big]\,\Big|_{q=0} \\ &= (1-\varepsilon)^2\int f^2(x)\,dx + 2\varepsilon(1-\varepsilon)f(z). \end{aligned}$$

Hence,

$$IC(z) = 2\Big(f(z) - \int f^2(x)\,dx\Big),$$

and the influence is bounded provided the density $f(\cdot)$ is bounded.

Following the heuristic argument of Huber (1981, p 14) the functional $T(F_n)$, expanded about $F$, yields

$$n^{1/2}\Big[T(F_n) - \int f^2(x)\,dx\Big] = 2n^{-1/2}\sum_{i=1}^n \Big[f(X_i) - \int f^2(x)\,dx\Big] + R_n. \tag{3.8}$$

Provided $R_n \xrightarrow{p} 0$, the leading term in 3.8 shows that we can anticipate, from the Central Limit Theorem, that $n^{1/2}[T(F_n) - \int f^2(x)\,dx]$ is asymptotically $n(0, 4\{\int f^3(x)\,dx - [\int f^2(x)\,dx]^2\})$. See the references at the beginning of this section for a rigorous development. Hence, we have the same limiting distribution for the density type estimate for the case of arbitrary shape as for $\hat\gamma$ in the symmetric case described by 2.5.
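The limiting variance $4\{\int f^3(x)\,dx - [\int f^2(x)\,dx]^2\}$ is easy to evaluate for a specific density. The sketch below does so for the standard normal, chosen here only as an assumed example; the helper names and the integration range $[-10, 10]$ are illustrative.

```python
import math

def midpoint(g, lo, hi, steps=100_000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(g(lo + (k + 0.5) * width) for k in range(steps))

def normal_density(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

gamma = midpoint(lambda x: normal_density(x) ** 2, -10.0, 10.0)   # int f^2
third = midpoint(lambda x: normal_density(x) ** 3, -10.0, 10.0)   # int f^3
variance_constant = 4.0 * (third - gamma ** 2)

def standard_error(n):
    """Approximate large-sample standard error of the estimate of gamma at sample size n."""
    return math.sqrt(variance_constant / n)
```

For the standard normal, $\int f^2 = 1/(2\pi^{1/2})$ and $\int f^3 = 1/(2\pi 3^{1/2})$, so the variance constant is about 0.049 and, for instance, `standard_error(100)` is roughly 0.022.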

Anticipating the estimation problems in the linear model raised in the Introduction, we discuss the estimation of $\gamma$ in the two-sample location model. This model is the simplest linear model and provides a framework for comparing the confidence interval approach to the density estimation approach.

Suppose that we observe $Y_1, \ldots, Y_n$, with $Y_1, \ldots, Y_m$ and $Y_{m+1}-\beta, \ldots, Y_n-\beta$ all iid with continuous cdf $F(\cdot)$. We suppose, without loss of generality, that $F(0) = 1/2$. This corresponds to a linear model with $\alpha = 0$ and $X$ an $n\times 1$ vector of $m$ zeros and $n-m$ ones, so that $Y = X\beta + e$ as described in the Introduction. Further, with $\phi^+(u) = 3^{1/2}u$, from 1.5 we have $\phi(u) = (12)^{1/2}(u - 1/2)$, the two-sample Wilcoxon score function. The gradient of $D$ in 1.6 suggests the Mann-Whitney-Wilcoxon statistic:

$$S(\beta) = \frac{(12)^{1/2}}{n+1}\sum_{i=m+1}^n \big[R_i(\beta) - (n+1)/2\big], \tag{3.9}$$

where $R_1(\beta), \ldots, R_n(\beta)$ are the ranks of $Y_1, \ldots, Y_m, Y_{m+1}-\beta, \ldots, Y_n-\beta$. (We will treat $Y_1, \ldots, Y_m$ as the first sample and $Y_{m+1}, \ldots, Y_n$ as the second.)

The random variable $S(\beta)$ can be written in terms of counts as

$$S(\beta) = \frac{(12)^{1/2}}{n+1}\sum_{i=1}^m\sum_{j=m+1}^n \{I(Y_j - Y_i > \beta) - 1/2\}. \tag{3.10}$$
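The equivalence of the rank form 3.9 and the counting form 3.10 can be confirmed on a toy data set. The function names and data below are hypothetical, and the sketch assumes no ties among the aligned observations.

```python
import math

def s_rank_form(y, m, beta):
    """3.9: centered sum of the ranks of the aligned second sample."""
    n = len(y)
    aligned = list(y[:m]) + [v - beta for v in y[m:]]
    order = sorted(range(n), key=lambda i: aligned[i])
    rank = [0] * n
    for r, i in enumerate(order, start=1):
        rank[i] = r
    return math.sqrt(12.0) / (n + 1) * sum(rank[i] - (n + 1) / 2.0 for i in range(m, n))

def s_count_form(y, m, beta):
    """3.10: counts of between-sample differences Y_j - Y_i exceeding beta."""
    n = len(y)
    total = sum((1.0 if y[j] - y[i] > beta else 0.0) - 0.5
                for i in range(m) for j in range(m, n))
    return math.sqrt(12.0) / (n + 1) * total

# Hypothetical data: the first m = 3 values form the first sample.
y = [0.3, 1.1, -0.7, 2.0, 1.6, 3.2, 0.9]
m = 3
checks = [(s_rank_form(y, m, b), s_count_form(y, m, b)) for b in (0.0, 1.0, 2.5)]
```

Because $S(\beta)$ only changes when $\beta$ crosses one of the $m(n-m)$ differences $Y_j - Y_i$, the interval end points can be read off from the ordered differences.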


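Suppose $P(S(0) < -k) = \alpha/2$. Then $S(\hat\beta_L) = k$ and $S(\hat\beta_U) = -k$ define $[\hat\beta_L, \hat\beta_U]$, a $(1-\alpha)100\%$ confidence interval for $\beta$. Suppose $m/n \to \lambda$, $0 < \lambda < 1$; then Lehmann (1963) showed that

$$\hat\gamma_L = \frac{2k(n+1)}{12^{1/2}\,m(n-m)\,(\hat\beta_U - \hat\beta_L)} \xrightarrow{p} \int f^2(x)\,dx, \tag{3.11}$$

where $k \doteq z_{\alpha/2}(m(n-m)/(n+1))^{1/2}$ from the normal approximation, similar to 2.3.

Now let $h_n = \hat\beta_U - \hat\beta_L$ and $\hat\beta = (\hat\beta_U + \hat\beta_L)/2$. Clearly, $h_n \xrightarrow{p} 0$ and $\hat\beta \xrightarrow{p} \beta$. Since $n^{1/2}(\hat\beta - \beta)$ is bounded in probability, using 3.10 we can rewrite 3.11 as

$$\begin{aligned} \hat\gamma_L &= [m(n-m)h_n]^{-1}\sum_{i=1}^m\sum_{j=m+1}^n \{I(Y_j - Y_i > \hat\beta_L) - I(Y_j - Y_i > \hat\beta_U)\} \\ &\doteq [m(n-m)h_n]^{-1}\sum_{i=1}^m\sum_{j=m+1}^n I(|Y_j - \hat\beta - Y_i| < h_n/2). \end{aligned} \tag{3.12}$$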


Thus $\hat\gamma_L$ is like a window estimate computed on the residuals after fitting the linear model. In fact, we can write

$$\hat\gamma_L \doteq \int f_m(x)\,dF_{n-m}(x), \tag{3.13}$$

where $f_m(\cdot)$ is a rectangular window estimate of the density based on the first sample and $F_{n-m}(\cdot)$ is the empirical cdf based on the residuals of the second sample. Similarly,

$$\hat\gamma_L \doteq \int f_{n-m}(x)\,dF_m(x), \tag{3.14}$$

where the density estimate is computed on residuals from the second sample and $F_m(\cdot)$ is the empirical cdf of the first sample.

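Let $r_i = Y_i$, $i = 1, \ldots, m$ and $r_i = Y_i - \hat\beta$, $i = m+1, \ldots, n$ denote the residuals. Then the density estimate of $\gamma$ based on $r_1, \ldots, r_n$ is $\int f_n(x)\,dF_n(x)$, which we can write as

$$\begin{aligned} \hat\gamma &= [n^2h_n]^{-1}\sum_{i=1}^n\sum_{j=1}^n I(|r_j - r_i| < h_n/2) \\ &= [n^2h_n]^{-1}\Big[2\sum_{i=1}^m\sum_{j=m+1}^n I\{|Y_j - \hat\beta - Y_i| < h_n/2\} + \sum_{i=1}^m\sum_{j=1}^m I\{|Y_j - Y_i| < h_n/2\} \\ &\qquad\qquad + \sum_{i=m+1}^n\sum_{j=m+1}^n I\{|Y_j - Y_i| < h_n/2\}\Big]. \end{aligned} \tag{3.15}$$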

Hence, from 3.12,

$$\hat\gamma \doteq \frac{2m(n-m)}{n^2}\,\hat\gamma_L + \frac{m^2}{n^2}\int f_m(x)\,dF_m(x) + \frac{(n-m)^2}{n^2}\int f_{n-m}(x)\,dF_{n-m}(x). \tag{3.16}$$
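The decomposition 3.16 is an exact counting identity once $\hat\gamma_L$ is taken in the window form on the second line of 3.12. The sketch below checks this on hypothetical residuals; `pair_window_sum` and `decomposition` are illustrative names, and the window width is arbitrary.

```python
def pair_window_sum(a, b, h):
    """Number of pairs (u in a, v in b) with |v - u| < h/2."""
    return sum(1 for u in a for v in b if abs(v - u) < h / 2.0)

def decomposition(first, second_residuals, h):
    m, k = len(first), len(second_residuals)      # k plays the role of n - m
    n = m + k
    pooled = list(first) + list(second_residuals)
    gamma_hat = pair_window_sum(pooled, pooled, h) / (n * n * h)           # 3.15
    gamma_l = pair_window_sum(first, second_residuals, h) / (m * k * h)    # window form of 3.12
    within_1 = pair_window_sum(first, first, h) / (m * m * h)              # int f_m dF_m
    within_2 = pair_window_sum(second_residuals, second_residuals, h) / (k * k * h)
    weighted = (2.0 * m * k * gamma_l + m * m * within_1 + k * k * within_2) / (n * n)
    return gamma_hat, weighted

# Hypothetical first sample and aligned second sample.
gamma_hat, weighted = decomposition([0.2, 1.0, 1.3, 2.4], [0.1, 0.9, 1.5, 2.0, 2.2], 0.6)
```

The weights $2m(n-m)/n^2$, $m^2/n^2$ and $(n-m)^2/n^2$ sum to one, so the pooled estimate is a convex combination of the three pieces.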


The density estimate $\hat\gamma$, constructed from the residuals, is a weighted sum of three estimates: $\hat\gamma_L$ (the confidence interval estimate), and two separate density estimates of $\int f^2(x)\,dx$ based on the two samples. It would appear that $\hat\gamma$ uses more information from the data than $\hat\gamma_L$. In the final section we discuss briefly the extension to the general linear model.

4.  Extensions  to  the  Linear  Model 

We again consider the comments in the last paragraph of the Introduction. In order to construct tests based on 1.9 and 1.10, without making the assumption of symmetry of $f(\cdot)$, we propose to use the ideas suggested in Section 3. In particular, for Wilcoxon scores, we would use $\hat\gamma$ defined in 3.15, where $r_i = Y_i - x_i'\hat\beta$, $i = 1, \ldots, n$ and $\hat\beta$ is the full-model rank-estimate proposed by Jaeckel and Jureckova.
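The proposal can be sketched end to end for a single regressor. Everything specific below is an assumption made for illustration: the simulated data with skewed (exponential) errors, the window width `h`, and the shortcut of minimizing the dispersion over pairwise slopes, which works for $p = 1$ because $D$ is convex and piecewise linear with kinks only at those slopes. It is not the algorithm of any particular software package.

```python
import math
import random

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for k, i in enumerate(order, start=1):
        r[i] = k
    return r

def dispersion(y, x, beta):
    """Jaeckel's D with Wilcoxon scores; the intercept drops out of D."""
    res = [yi - xi * beta for yi, xi in zip(y, x)]
    n = len(res)
    return sum(math.sqrt(12.0) * (r / (n + 1) - 0.5) * e
               for r, e in zip(ranks(res), res))

def rank_slope(y, x):
    """Minimize D over the pairwise slopes, where the piecewise-linear D has its kinks."""
    n = len(y)
    candidates = [(y[j] - y[i]) / (x[j] - x[i])
                  for i in range(n) for j in range(i + 1, n) if x[j] != x[i]]
    return min(candidates, key=lambda b: dispersion(y, x, b))

def gamma_hat(residuals, h):
    """3.15: rectangular-window estimate of int f^2 from all pairs of residuals."""
    n = len(residuals)
    hits = sum(1 for a in residuals for b in residuals if abs(a - b) < h / 2.0)
    return hits / (n * n * h)

rng = random.Random(1)
x = [i / 6.0 for i in range(60)]
y = [2.0 * xi + rng.expovariate(1.0) for xi in x]   # simulated: true slope 2, skewed errors
beta_hat = rank_slope(y, x)
res = [yi - xi * beta_hat for yi, xi in zip(y, x)]
g = gamma_hat(res, h=0.5)                           # h is an illustrative choice
tau_hat = 1.0 / (math.sqrt(12.0) * g)               # tau^{-1} = 12^{1/2} * gamma
```

With `tau_hat` in hand, the statistics 1.9 and 1.10 can be formed by substituting it for $\tau$; no symmetry assumption on the errors enters at any step.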

Since $r_1, \ldots, r_n$ are neither independent nor identically distributed, the consistency of $\hat\gamma$ in 3.15 does not follow from the results cited in the previous sections. In the case of Wilcoxon scores, Aubuchon (1982) proved consistency of $\hat\gamma$ and studied its behavior. Further discussion, with an application to a data set, can be found in Aubuchon and Hettmansperger (1982).

In  1983  a  rank-regression  command  will  be  available  in  the  Minitab  statistical 
computing  system.  The  output  will  contain  the  rank  estimates  of  Jaeckel  and 
Jureckova,  and  the  tests  1.9  and  1.10  using  the  density  estimate. 

In a designed experiment or a regression model with replicates, the estimate $\hat\gamma_L$, 3.11, based on the confidence interval of a shift parameter between groups of replicates, does not require symmetry of $f(\cdot)$. A final estimate of $\gamma$ is constructed by pooling the individual estimates formed from pairs of replicate groups. Draper (1981) studied these estimates in detail. However, $\hat\gamma$ in 3.16 suggests that a density estimate formed from all the residuals may be more informative. No careful comparison of the two approaches has been carried out yet.